Non-negative Matrix Factorization for Discrete Data with Hierarchical Side-Information
نویسندگان
چکیده
We present a probabilistic framework for efficient non-negative matrix factorization of discrete (count/binary) data with sideinformation. The side-information is given as a multi-level structure, taxonomy, or ontology, with nodes at each level being categorical-valued observations. For example, when modeling documents with a twolevel side-information (documents being at level-zero), level-one may represent (one or more) authors associated with each document and level-two may represent affiliations of each author. The model easily generalizes to more than two levels (or taxonomy/ontology of arbitrary depth). Our model can learn embeddings of entities present at each level in the data/sideinformation hierarchy (e.g., documents, authors, affiliations, in the previous example), with appropriate sharing of information across levels. The model also enjoys full local conjugacy, facilitating efficient Gibbs sampling for model inference. Inference cost scales in the number of non-zero entries in the data matrix, which is especially appealing for real-world massive but sparse matrices. We demonstrate the effectiveness of the model on several real-world data sets.
منابع مشابه
A new approach for building recommender system using non negative matrix factorization method
Nonnegative Matrix Factorization is a new approach to reduce data dimensions. In this method, by applying the nonnegativity of the matrix data, the matrix is decomposed into components that are more interrelated and divide the data into sections where the data in these sections have a specific relationship. In this paper, we use the nonnegative matrix factorization to decompose the user ratin...
متن کاملHierarchical Compound Poisson Factorization
Non-negative matrix factorization models based on a hierarchical Gamma-Poisson structure capture user and item behavior effectively in extremely sparse data sets, making them the ideal choice for collaborative filtering applications. Hierarchical Poisson factorization (HPF) in particular has proved successful for scalable recommendation systems with extreme sparsity. HPF, however, suffers from ...
متن کاملIterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition
Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...
متن کاملSemi-supervised Data Clustering with Coupled Non-negative Matrix Factorization: Sub-category Discovery of Noun Phrases in NELL’s Knowledge Base
The standard non-negative matrix factorization (NMF) is a popular method to obtain low-rank approximation of a non-negative matrix, which is also powerful for clustering and classification in machine learning. In NMF each data sample is represented by a vector of features of the same dimension. In practice, we often have good side information for a subset of data samples. These side information...
متن کاملReducing Dimensionality in Text Mining using Conjugate Gradients and Hybrid Cholesky Decomposition
Generally, data mining in larger datasets consists of certain limitations in identifying the relevant datasets for the given queries. The limitations include: lack of interaction in the required objective space, inability to handle the data sets or discrete variables in datasets, especially in the presence of missing variables and inability to classify the records as per the given query, and fi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016